Event-based text mining for biology and functional genomics
Authors
Abstract
The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.
Similar resources
Empowering web portal users with personalized text mining services
Fedor Bakalov (1), Marie-Jean Meurs (2,3), Birgitta König-Ries (1), Bahar Sateli (2), René Witte (2), Greg Butler (2,3), Adrian Tsang (3,4). (1) Institute for Computer Science, Friedrich Schiller University of Jena, Jena, Germany; (2) Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada; (3) Centre for Structural and Functional Genomics, Concordia University, Montreal, Canada; (4) Depar...
Critical Assessment of Information Extraction Systems in Biology
An increasing number of groups are now working in the area of text mining, focusing on a wide range of problems and applying both statistical and linguistic approaches. However, it is not possible to compare the different approaches, because there are no common standards or evaluation criteria; in addition, the various groups are addressing different problems, often using private datasets. As a...
The Value of an in-Domain Lexicon in genomics QA
This paper demonstrates that a large-scale lexicon tailored for the biology domain is effective in improving question analysis for genomics Question Answering (QA). We use the TREC Genomics Track data to evaluate the performance of different question analysis methods. It is hard to process textual information in biology, especially in molecular biology, due to a huge number of technical terms w...
Text Mining for Finding Functional Community of Related Genes Using TCM Knowledge
We present a novel text mining approach to uncover functional gene relationships, and possibly temporal and spatial functional modular interaction networks, from MEDLINE on a large scale. Unlike conventional approaches, which consider only the reductionistic molecular biological knowledge in MEDLINE, we use TCM knowledge (e.g. Symptom Complex) and the 50,000 TCM bibliographic records to automatic...
Using Profile Matching and Text Categorization for Answer Extraction in TREC Genomics
The TREC’06 genomics track focused on text mining and passage retrieval. The WIM lab participated in this year’s TREC genomics track. Our system consists of five parts: preprocessing, sentence generation, document retrieval, answer extraction and answer fusion. We developed two different methods: an automated profile matching-based method and a text categorization-based method to do the text minin...
Journal title:
Volume 14, issue
Pages -
Publication date: 2015